Search for: All records

Creators/Authors contains: "Collis, Scott"

  1. According to the World Health Organization, healthy communities rely on well-functioning ecosystems. Clean air, fresh water, and nutritious food are inextricably linked to ecosystem health. Changes in biological activity convey important information about ecosystem dynamics, and understanding such changes is crucial for the survival of our species. Scientific edge cyberinfrastructures collect distributed data and process it in situ, often using machine learning algorithms. Most current machine learning algorithms deployed on edge cyberinfrastructures, however, are trained on data that does not accurately represent the real stream of data collected at the edge. In this work we explore the applicability of two new self-supervised learning algorithms for characterizing an insufficiently curated, imbalanced, and unlabeled dataset collected by using a set of nine microphones at different locations at the Morton Arboretum, an internationally recognized tree-focused botanical garden and research center in Lisle, IL. Our implementations showed completely autonomous characterization capabilities, such as the separation of spectrograms by recording location, month, week, and hour of the day. The models also showed the ability to discriminate spectrograms by biological and atmospheric activity, including rain, insects, and bird activity, in a completely unsupervised fashion. We validated our findings using a supervised deep learning approach and with a dataset labeled by experts, confirming competitive performance in several features. Toward explainability of our self-supervised learning approach, we used acoustic indices and false color spectrograms, showing that the topology and orientation of the clouds of points in the output space over a 24-h period are strongly linked to the unfolding of biological activity. 
     Our findings show that self-supervised learning has the potential to learn from and process data collected at the edge, characterizing it with minimal human intervention. We believe that further research is crucial to extending this approach for complete autonomous characterization of raw data collected on distributed sensors at the edge.
    Free, publicly-accessible full text available November 1, 2025
  2. Abstract. There is a continuously increasing need for reliable feature detection and tracking tools based on objective analysis principles for use with meteorological data. Many tools have been developed over the previous 2 decades that attempt to address this need but most have limitations on the type of data they can be used with, feature computational and/or memory expenses that make them unwieldy with larger datasets, or require some form of data reduction prior to use that limits the tool's utility. The Tracking and Object-Based Analysis of Clouds (tobac) Python package is a modular, open-source tool that improves on the overall generality and utility of past tools. A number of scientific improvements (three spatial dimensions, splits and mergers of features, an internal spectral filtering tool) and procedural enhancements (increased computational efficiency, internal regridding of data, and treatments for periodic boundary conditions) have been included in tobac as a part of the tobac v1.5 update. These improvements have made tobac one of the most robust, powerful, and flexible identification and tracking tools in our field to date and expand its potential use in other fields. Future plans for tobac v2 are also discussed. 
  3. Abstract. Accurate cloud type identification and coverage analysis are crucial in understanding the Earth’s radiative budget. Traditional computer vision methods rely on low-level visual features of clouds for estimating cloud coverage or sky conditions. Several handcrafted approaches have been proposed; however, scope for improvement still exists. Newer deep neural networks (DNNs) have demonstrated superior performance for cloud segmentation and categorization. These methods, however, need either expert engineering intervention in the preprocessing steps (for the traditional methods) or human assistance in assigning cloud or clear-sky labels to each pixel (for training DNNs). Such human mediation imposes considerable time and labor costs. We present the application of a new self-supervised learning approach to autonomously extract relevant features from sky images captured by ground-based cameras, for the classification and segmentation of clouds. We evaluate a joint embedding architecture that uses self-knowledge distillation plus regularization. We use two datasets to demonstrate the network’s ability to classify and segment sky images: one with approximately 85,000 images collected from our ground-based camera and another with 400 labeled images from the WSISEG database. We find that this approach can discriminate full-sky images based on cloud coverage, diurnal variation, and cloud base height. Furthermore, it semantically segments the cloud areas without labels. The approach shows competitive performance in all tested tasks, suggesting a new alternative for cloud characterization.
  4. Abstract. Phase correlation (PC) is a well-known method for estimating cloud motion vectors (CMVs) from infrared and visible spectrum images. Commonly, phase shift is computed in the small blocks of the images using the fast Fourier transform. In this study, we investigate the performance and the stability of the blockwise PC method by changing the block size, the frame interval, and combinations of red, green, and blue (RGB) channels from the total sky imager (TSI) at the United States Atmospheric Radiation Measurement user facility's Southern Great Plains site. We find that shorter frame intervals, followed by larger block sizes, are responsible for stable estimates of the CMV, as suggested by the higher autocorrelations. The choice of RGB channels has a limited effect on the quality of CMVs, and the red and the grayscale images are marginally more reliable than the other combinations during rapidly evolving low-level clouds. The stability of CMVs was tested at different image resolutions with an implementation of the optimized algorithm on the Sage cyberinfrastructure test bed. We find that doubling the frame rate outperforms quadrupling the image resolution in achieving CMV stability. The correlations of CMVs with the wind data are significant in the range of 0.38–0.59 with a 95 % confidence interval, despite the uncertainties and limitations of both datasets. A comparison of the PC method with constructed data and the optical flow method suggests that the post-processing of the vector field has a significant effect on the quality of the CMV. The raindrop-contaminated images can be identified by the rotation of the TSI mirror in the motion field. The results of this study are critical to optimizing algorithms for edge-computing sensor systems. 
  5. Abstract. The scientific community has expressed interest in the potential of phased array radars (PARs) to observe the atmosphere with finer spatial and temporal scales. Although convergence has occurred between the meteorological and engineering communities, the need exists to increase meteorologists' access to PARs. Here, we facilitate these interdisciplinary efforts in the field of ground-based PARs for atmospheric studies. We cover high-level technical concepts and terminology for PARs as applied to studies of the atmosphere. A historical perspective is provided as context along with an overview of PAR system architectures, technical challenges, and opportunities. Envisioned scan strategies are summarized because they are distinct from traditional mechanically scanned radars and are the most advantageous for high-resolution studies of the atmosphere. Open access to PAR data is emphasized as a mechanism to educate the future generation of atmospheric scientists. Finally, a vision for the future of operational networks, research facilities, and expansion into complementary radar wavelengths is provided.
  6. Cloud cover estimation from images taken by sky-facing cameras can be an important input for analyzing current weather conditions and estimating photovoltaic power generation. The constant change in position, shape, and density of clouds, however, makes the development of a robust computational method for cloud cover estimation challenging. Accurately determining the edge of clouds and hence the separation between clouds and clear sky is difficult and often impossible. Toward determining cloud cover for estimating photovoltaic output, we propose using machine learning methods for cloud segmentation. We compare several methods including a classical regression model, deep learning methods, and boosting methods that combine results from the other machine learning models. To train each of the machine learning models with various sky conditions, we supplemented the existing Singapore whole sky imaging segmentation database with hazy and overcast images collected by a camera-equipped Waggle sensor node. We found that the U-Net architecture, one of the deep neural networks we utilized, segmented cloud pixels most accurately. However, the accuracy of segmenting cloud pixels did not guarantee high accuracy of estimating solar irradiance. We confirmed that the cloud cover ratio is directly related to solar irradiance. Additionally, we confirmed that solar irradiance and solar power output are closely related; hence, by predicting solar irradiance, we can estimate solar power output. This study demonstrates that sky-facing cameras with machine learning methods can be used to estimate solar power output. This ground-based approach provides an inexpensive way to understand solar irradiance and estimate production from photovoltaic solar facilities. 
  7. Abstract. A multi-agency succession of field campaigns was conducted in southeastern Texas from July 2021 through October 2022 to study the complex interactions of aerosols, clouds and air pollution in the coastal urban environment. As part of the Tracking Aerosol Convection interactions Experiment (TRACER), the TRACER-Air Quality (TAQ) campaign, the Experiment of Sea Breeze Convection, Aerosols, Precipitation and Environment (ESCAPE), and the Convective Cloud Urban Boundary Layer Experiment (CUBE), a combination of ground-based supersites and mobile laboratories, shipborne measurements and aircraft-based instrumentation was deployed. These diverse platforms collected high-resolution data to characterize the aerosol microphysics and chemistry, cloud and precipitation micro- and macro-physical properties, environmental thermodynamics and air quality-relevant constituents that are being used in follow-on analysis and modeling activities. We present the overall deployment setups, a summary of the campaign conditions and a sampling of early research results related to: (a) aerosol precursors in the urban environment, (b) influences of local meteorology on air pollution, (c) detailed observations of the sea breeze circulation, (d) retrieved supersaturation in convective updrafts, (e) characterizing the convective updraft lifecycle, (f) variability in lightning characteristics of convective storms and (g) urban influences on surface energy fluxes. The work concludes with discussion of future research activities highlighted by the TRACER model-intercomparison project to explore the representation of aerosol-convective interactions in high-resolution simulations.
    Free, publicly-accessible full text available August 4, 2026
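
Item 4 above describes blockwise phase correlation: the phase shift between two frames is computed in small image blocks using the fast Fourier transform, giving one motion vector per block. The following is a minimal sketch of that general technique in Python with NumPy, not the authors' implementation. The function names, the integer-pixel peak search, and the non-overlapping block layout are assumptions made for illustration; the abstract does not specify any of them, and the post-processing of the vector field that it mentions is omitted.

```python
import numpy as np


def phase_correlation_shift(block_a, block_b):
    """Estimate the integer (dy, dx) shift that maps block_a onto block_b.

    Hypothetical helper: textbook phase correlation on two equally sized
    2-D arrays, assuming the motion is (approximately) a pure translation.
    """
    fa = np.fft.fft2(block_a)
    fb = np.fft.fft2(block_b)
    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = fb * np.conj(fa)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    # The inverse transform peaks at the displacement between the blocks.
    corr = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative (wrapped) shifts.
    shift = [p if p <= n // 2 else p - n for p, n in zip(peak, corr.shape)]
    return int(shift[0]), int(shift[1])


def blockwise_motion(frame_a, frame_b, block_size):
    """Return an array of shape (rows, cols, 2) holding one (dy, dx) per block.

    Assumption: non-overlapping square blocks; pixels that do not fill a
    whole block at the right and bottom edges are ignored.
    """
    rows = frame_a.shape[0] // block_size
    cols = frame_a.shape[1] // block_size
    vectors = np.zeros((rows, cols, 2), dtype=int)
    for r in range(rows):
        for c in range(cols):
            window = (
                slice(r * block_size, (r + 1) * block_size),
                slice(c * block_size, (c + 1) * block_size),
            )
            vectors[r, c] = phase_correlation_shift(frame_a[window], frame_b[window])
    return vectors
```

As a usage example, if `b = np.roll(a, shift=(3, -5), axis=(0, 1))` for a random 64 x 64 array `a`, then `phase_correlation_shift(a, b)` returns `(3, -5)`. In this sketch the block size passed to `blockwise_motion` and the time between the two frames are the knobs corresponding to the block size and frame interval that the abstract varies.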